Utilizing Dynamics and Reward Models in Learning Strategy Fusion

Authors

  • Akihiko Yamaguchi
  • Tsukasa Ogasawara
Abstract

Learning strategy (LS) fusion is our previous work within the reinforcement learning (RL) framework. LS fusion combines multiple LSs, such as transfer learning, for a single robot task. This paper introduces two LSs into LS fusion: an LS that learns dynamics and reward models, and an LS that uses a model-based RL method. In particular, we propose using the MixFS dynamics model, which is also our previous work. MixFS decomposes the dynamics model into task-specific elements and task-invariant elements. Thus, we can initialize the dynamics model of one task by transferring that of another task. In simulation experiments, we apply LS fusion with the new LSs to maze tasks of a small humanoid robot, where the primitive motions, crawling and turning, are also pre-learned by LS fusion. The results demonstrate that the new LSs improve the learning speed by using MixFS.

Similar resources

Reinforcement learning based feedback control of tumor growth by limiting maximum chemo-drug dose using fuzzy logic

In this paper, a model-free reinforcement learning-based controller is designed to extract a treatment protocol because the design of a model-based controller is complex due to the highly nonlinear dynamics of cancer. The Q-learning algorithm is used to develop an optimal controller for cancer chemotherapy drug dosing. In the Q-learning algorithm, each entry of the Q-table is updated using data...


O2: Neuroscience and Talent: How Neuroscience Can Enhance Successful Plan of Talent Strategy

Performance and development are based on hard work, experience and learning. Learning how to change different behaviors is crucial to successful talent management plans. Within the brain there are complex connected circuits that can identify threats. The brain reacts to change as a threat. There is also a collection of brain structures tied to a natural reward system that are involved in the re...


Efficient Exploration for Optimizing Immediate Reward

We consider the problem of learning an effective behavior strategy from reward. Although much studied, the issue of how to use prior knowledge to scale optimal behavior learning up to real-world problems remains an important open issue. We investigate the inherent data-complexity of behavior learning when the goal is simply to optimize immediate reward. Although easier than reinforcement learnin...


Efficient exploration for optimizing immediate reward

We consider the problem of learning an effective behavior strategy from reward. Although much studied, the issue of how to use prior knowledge to scale optimal behavior learning up to real-world problems remains an important open issue. We investigate the inherent data-complexity of behavior-learning when the goal is simply to optimize immediate reward. Although easier than reinforcement learni...


Explanation of Socratic dialectic aspects and teaching method: A strategy for improving the schools' teaching-learning process

The main purpose of this research is to explain the aspects of Socratic dialectic and the Socratic teaching method as a strategy for improving the schools' teaching-learning process. To this end, using a qualitative, descriptive-analytic method (document analysis), Socratic dialectic and then the philosophical foundations of the Socratic teaching method are first described. Then, the feasibility stud...



Publication date: 2011